Probabilistic Matrix Factorization with Non-random Missing Data
نویسندگان
چکیده
We propose a probabilistic matrix factorization model for collaborative filtering that learns from data that is missing not at random (MNAR). Matrix factorization models exhibit state-of-the-art predictive performance in collaborative filtering. However, these models usually assume that the data is missing at random (MAR), and this is rarely the case. For example, the data is not MAR if users rate items they like more than ones they dislike. When the MAR assumption is incorrect, inferences are biased and predictive performance can suffer. Therefore, we model both the generative process for the data and the missing data mechanism. By learning these two models jointly we obtain improved performance over state-of-the-art methods when predicting the ratings and when modeling the data observation process. We present the first viable MF model for MNAR data. Our results are promising and we expect that further research on NMAR models will yield large gains in collaborative filtering.
منابع مشابه
Bayesian Matrix Factorization with Non-Random Missing Data using Informative Gaussian Process Priors and Soft Evidences
We propose an extended Bayesian matrix factorization method, which can incorporate multiple sources of side information, combine multiple a priori estimates for the missing data and integrates a flexible missing not at random submodel. The model is formalized as probabilistic graphical model and a corresponding Gibbs sampling scheme is derived to perform unrestricted inference. We discuss the a...
متن کاملA new approach for building recommender system using non negative matrix factorization method
Nonnegative Matrix Factorization is a new approach to reduce data dimensions. In this method, by applying the nonnegativity of the matrix data, the matrix is decomposed into components that are more interrelated and divide the data into sections where the data in these sections have a specific relationship. In this paper, we use the nonnegative matrix factorization to decompose the user ratin...
متن کاملCoupled Compound Poisson Factorization
We present a general framework, the coupled compound Poisson factorization (CCPF), to capture the missing-data mechanism in extremely sparse data sets by coupling a hierarchical Poisson factorization with an arbitrary data-generating model. We derive a stochastic variational inference algorithm for the resulting model and, as examples of our framework, implement three different data-generating ...
متن کاملNetwork Coordination Algorithm Based on Distributed Probability Matrix Factorization
Network Coordination Systems (NCS) are powerful mechanisms to efficiently predict the network latency of pairwise hosts in Internet without directly network measurement between them. Each host is embedded into a distance space where the latency between nodes is calculated according to the distant function defined by the space with assigned coordination of hosts. In terms of the properties of di...
متن کاملMulti-HDP: A Non Parametric Bayesian Model for Tensor Factorization
Matrix factorization algorithms are frequently used in the machine learning community to find low dimensional representations of data. We introduce a novel generative Bayesian probabilistic model for unsupervised matrix and tensor factorization. The model consists of several interacting LDA models, one for each modality. We describe an efficient collapsed Gibbs sampler for inference. We also de...
متن کامل